Learning a Transferable World Model by Reinforcement Agent in Deterministic Observable Grid-World Environments
Authors
Abstract
Reinforcement-based agents have difficulty transferring their acquired knowledge to new, different environments because of the common identity-based percept representation and the lack of appropriate generalization capabilities. In this paper, the problem of knowledge transferability is addressed by proposing an agent endowed with decision tree induction and constructive induction capabilities that relies on a decomposable, property-based percept representation. The agent starts without any prior knowledge of its environment or of the effects of its actions. It learns a world model (a set of decision trees) that corresponds to a set of explicit action definitions predicting action effects in terms of the agent's percepts. The agent's planning component uses the world model's predictions to chain actions via a breadth-first search. The proposed agent was compared to Q-learning and Adaptive Dynamic Programming based agents and demonstrated a better ability to achieve goals in static, observable, deterministic grid-world environments different from those in which it learned its world model.
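The planning step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `world_model` here is a hand-written stand-in for the learned decision trees, and all names (`bfs_plan`, the action set, the grid size) are hypothetical. It shows how a planner can chain actions with breadth-first search using only the model's predicted action effects.

```python
from collections import deque

# Hypothetical action set for a grid-world; y grows downward.
ACTIONS = ["up", "down", "left", "right"]
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def world_model(state, action, width=4, height=4):
    """Stand-in for the learned decision trees: predicts the effect of an
    action in a deterministic, fully observable grid-world."""
    x, y = state
    dx, dy = MOVES[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < width and 0 <= ny < height:
        return (nx, ny)
    return state  # bumping into a wall leaves the agent in place

def bfs_plan(start, goal, model=world_model):
    """Chain actions via breadth-first search over model predictions."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if state == goal:
            return plan  # shortest action sequence under the model
        for action in ACTIONS:
            nxt = model(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None  # goal unreachable according to the model

plan = bfs_plan((0, 0), (2, 1))
print(plan)  # a shortest 3-step plan
```

Because the search queries only the model, the same planner works unchanged in any new environment for which the learned action definitions still hold, which is the transferability argument of the abstract.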
Similar Resources
Analysis of Memory-Based Learning Schemes for Robot Navigation in Discrete Grid-Worlds with Partial Observability
Abstract In this paper we tackle the problem of robot navigation in discrete grid-worlds using memory-based learning schemes. Different memory-based approaches are tested for navigating an agent across a discrete but partially observable world, and the significance of memory structure is examined. Further, the effects of additional memory hierarchies and multi-level learning frameworks are anal...
Using Advice in Model-Based Reinforcement Learning
When a human is mastering a new task, they are usually not limited to exploring the environment, but also avail themselves of advice from other people. In this paper, we consider the use of advice expressed in a formal language to guide exploration in a model-based reinforcement learning algorithm. In contrast to constraints, which can eliminate optimal policies if they are not sound, advice is...
Projective simulation applied to the grid-world and the mountain-car problem
We study the model of projective simulation (PS) which is a novel approach to artificial intelligence (AI). Recently it was shown that the PS agent performs well in a number of simple task environments, also when compared to standard models of reinforcement learning (RL). In this paper we study the performance of the PS agent further in more complicated scenarios. To that end we chose two well-...
Robot Navigation in Partially Observable Domains using Hierarchical Memory-Based Reinforcement Learning
In this paper, we attempt to find a solution to the problem of robot navigation in a domain with partial observability. The domain is a grid-world with intersecting corridors, where the agent learns an optimal policy for navigation by making use of a hierarchical memory-based learning algorithm. We define a hierarchy of levels over which the agent abstracts the learning process, as well as it...
DIVA: A Self Organizing Adaptive World Model for Reinforcement Learning
Reinforcement learning algorithms without an internal world model often suffer from long convergence times. Typically the agent must succeed several hundred times before it can learn how to behave in even simple environments. In this case, a world model can be used to reduce the number of real-world trials by performing actions virtually in the world model. This may help to...
Journal: ITC
Volume: 41, Issue: -
Pages: -
Publication date: 2012